ABSTRACT. — Knowing the distribution of species and the factors which determine it is a basic requirement 
for conservation efforts and developing management plans. Species distribution modelling (SDM) is a 
speedy and cost-effective tool for predicting species distributions, particularly for species in remote and 
inaccessible areas. This technique can be applied, for example, to poorly known small carnivore species 
in Southeast Asia, a biodiversity hotspot for mammals. SDM is used to gain ecological insights about the 
environmental factors that determine species distribution, and helps to identify the areas where a species can 
occur and where conflicts may arise. However, recent advances in statistical theory and computer processing 
have made SDM a somewhat complex, diverse, and confusing area of research. This review presents an 
overview of the different techniques of species distribution modelling, and of the databases needed to answer 
applied questions in carnivore conservation, particularly in the tropics. We guide the ecologist through 
different methods which have become established approaches in the scientific literature and through freely 
available resources on abiotic data (environmental layers) for conducting such studies. We summarise the 
steps involved in predictive species distribution modelling, where the (carnivore) occurrence data come 
from different resources (such as museum records, voluntary surveys, systematic surveys, etc.). Finally, we 
explore the applications of such predictions in carnivore conservation. 
INTRODUCTION 

With biodiversity in the tropics being in turmoil, mitigation 
of human-mediated impacts such as shifting land use or 
wildlife exploitation continues to be a clear focus for tropical 
conservation biology (Bradshaw et al., 2009). Carnivores and 
ungulates are a special target of conservation in the tropics 
and can serve as effective primers for designing conservation 
landscapes and management measures at the human-wildlife 
interface (Ray, 2010). Human activities often conflict with 
needs of carnivores. The question of how to integrate land 
exploitation and carnivore conservation in human-dominated 
landscapes poses one of the major challenges in conservation. 


Mammalian carnivores are undoubtedly a challenging group 
of organisms for conservation biologists. Large carnivores 
often cause problems at the livestock-wildlife interface 
and small carnivores are potential reservoirs of emerging 
infectious diseases, such as the coronavirus responsible for 
recent SARS outbreaks (Bell et al., 2004; McLean et al., 
2005). However, small carnivores also play a 
crucial role in the functioning of the ecosystem, for example as 
dispersers of seeds and controllers of pest species (Jordano 
et al., 2007; Roemer et al., 2009). 

Although the major challenges in mitigating the imminent 
threats to biodiversity in Southeast Asia are primarily 
socio-economic in origin (Sodhi et al., 2004), modelling 
is an essential element of efforts to convert conservation 
in Southeast Asia into success (Fordham & Brook, 2010). 
Especially useful are models that capture the complexities 
and uncertainties underlying biological mechanisms that drive 
species distribution and abundance. Ideally, one would use 
models that integrate demography, the spatial structure of 
the landscape, and ecosystem processes, but they tend to be 
complex and require detailed data (Wiegand et al., 2004a; 
Kramer-Schadt et al., 2005). Therefore the first important step 
is to predict the potentially suitable area for the species and 
to understand the underlying environmental factors. This is 
the basic knowledge required for the effective management 
of natural resources for conservation. 

Species distribution modelling (SDM) has become an 
important tool in conservation ecology for addressing 
the basic question of which areas are potentially suitable for a 
species and which environmental factors underlie this (Elith & 
Leathwick, 2009). This statistical technique relates species 
occurrences or abundance to environmental information 
and/or spatial characteristics of those locations (Elith & 
Leathwick, 2009; Franklin, 2009). Predictions of SDMs are 
useful in identifying the core area for the conservation of 
species and can be the first step in management applications, 
such as site selection in reserve design (Zielinski et al., 2006), 
forecasting the response of species to environmental change 
and climate change (Carroll, 2007), studying large scale 
biogeographical issues such as geographic range contraction 
and taxonomic boundaries (Gates & Donald, 2000; Donald 
& Greenwood, 2001), invasive species biology (Peterson, 
2003; Herborg et al., 2007; Loo et al., 2007) and ecosystem 
studies (Ferrier et al., 2002; MacNally & Fleishman, 2004). 
Habitat maps derived from SDMs can also be used to identify 
conflict areas at the human-wildlife interface (Kanagaraj 
et al., 2011a; De Angelo et al., 2013), to pin down areas of 
carnivore conservation concern, and to delineate and evaluate 
corridors in order to maintain connectivity between suitable 
habitats for long-term conservation of carnivore populations 
(Kanagaraj et al., 2011a). 

This review presents an overview of the different 
techniques of species distribution modelling and the databases 
needed to answer applied questions in carnivore conservation 
in the tropics. This is an especially challenging task because of 
limited financial resources, climatic conditions that are hard on 
researchers and equipment, and the inaccessibility of study areas. Several 
studies provide information on general aspects of SDM. They 
include reviews on technical aspects and methodological 
advice (Guisan & Zimmermann, 2000; Stauffer, 2002; 
Guisan & Thuiller, 2005; Richards et al., 2007; Schroder, 
2008), and historical and cross-disciplinary features including 
the review of Elith & Leathwick (2009). Franklin (2009) 
reviews and synthesises the vast literature on SDMs in her 
recent book, and Peterson et al. (2011) aim to offer a body 
of terminology and schemes by which to understand and 
discuss the complex relationships between ecological niches 
and geographic distributions of species. Here, we guide the 
ecologist through different methods which have become 
established approaches in the scientific literature and through 
resources on abiotic data (environmental layers) available 
freely for conducting such studies. We summarise the steps 
involved in predictive distribution modelling (Fig. 1), where 
the (carnivore) occurrence data come from different resources 
(such as museum records, voluntary surveys, systematic 
surveys, etc.). Finally, we explore the applications of such 
predictions in carnivore conservation. 

I: SPECIES OCCURRENCE DATA 

Although species distribution modelling algorithms currently 
use four different types of data, presence-only, presence- 
absence, presence-pseudoabsence, and presence-background 
(see sub-section Modelling algorithms), the types of species 
occurrence data required as input data in species distribution 
modelling are usually presence-only or presence-absence 
data. Documentation of presence or absence of a species at 
a survey site is often complicated because two groups of 
factors have to be taken into account: biological factors and 
factors related to the detectability of the species. Thus, to 
obtain species occurrence data of good quality one should 
consider these factors when planning the survey (Peterson 
et al., 2011). Different modelling approaches have been 
developed to deal with presence-only and presence-absence 
data. Presence-only data report known occurrences (presence) 
of species at a given location, but do not provide information 
about absences. Presence-only data may stem from different 
sources including direct sightings in transects, sign surveys, 
non-systematic surveys, incidental direct sightings, and 
museum specimens. 

The most reliable and accurate presence-only data sets are 
provided by direct sightings in transect counts that may use 
spotlights in night transects, trapping data, camera traps, and 
radio telemetry data (e.g. Durant et al., 2010; Pettorelli et 
al., 2010; Kanagaraj et al., 2011a). However, the effort of 
covering vast areas with such intensive techniques is large. 
This renders direct sighting methods ineffective for wide- 
ranging carnivores. Alternative methods include non-invasive 
sign survey methods which do not rely on capturing or direct 
observation of wide-ranging carnivores (Long et al., 2008). 
Sign surveys are similar to spotlighting and audio playbacks 
in terms of detection efficiency, precision, effort, and cost 
in landscape-scale surveys (Thorn et al., 2010). Sign survey 
data include the locations of indirect evidence such as 
tracks, faecal samples, signs of depredation, and scrapes. 
Sign surveys are usually conducted by surveying along 
features that are likely to conserve carnivore sign such as 
dirt roads, dry water courses, and animal trails. They may 
use systematic sampling schemes (e.g., Smith et al., 1999; 
Thorn et al., 2010; Jhala et al., 2011; Kanagaraj et al., 2011a) 
or non-systematic surveys related to monitoring programs 
based on the collaboration of a network of volunteers and 
researchers (e.g. De Angelo et al., 2011). 

Although novel quantitative methods have been developed 
to identify tracks of different carnivore species (Smith et 
al., 1999; Jhala et al., 2010; De Angelo et al., 2010), tracks 
of many carnivore species can often not be distinguished in 
the tropics even by highly experienced researchers. Mathai 
et al. (2010) observed that, out of 14 small carnivore species 
recorded, sign surveys were useful for only three 
(Malay Civet, Sun Bear, and otters) in a Southeast Asian 
forest. Unlike in regions outside tropical rainforests, 
where carnivore diversity is lower and/or substrates are more 
suitable for tracks, in rainforests a leaf layer covers the ground and reliable 
identification based on tracks is only possible for 
species with diagnostic tracks such as the largest carnivores 
(e.g., tigers, bears). Similar difficulties arise in the tropics for 
the use of faecal samples as presence information because 
species identification from scats is difficult and supplementary 
evidence in the form of associated tracks and scrapes is 
seldom available (e.g., Smith et al., 1999). Fast-degrading 
DNA in tropical environments also complicates the successful 
application of molecular techniques (Goossens & Salgado-Lynn, 
2013) even if they are developed specifically for 
different species (Fernandez et al., 2006; Haag et al., 2009). 

Incidental direct sightings (e.g., De Angelo et al., 2011) 
or records of museum specimens, which are now widely 
available through networking of museum collections 
(Graham et al., 2004), are often a much more reliable 
source. However, because they are usually not collected 
in a systematic manner these data sets are typically biased 
towards certain locations where sightings are easy (e.g., 
roads) or places favoured by specimen hunters (Reddy & 
Davalos, 2003; Vaughan & Ormerod, 2003; Phillips et al., 
2006). As we will see, non-systematically selected data must 
cover the environmental space sufficiently well (i.e., cover the 
environmental conditions where the species occurs and where it does 
not occur in the geographical study area), otherwise they may 
only reflect the detectability of the species (Phillips et al., 2009). 

Additionally, this type of data may include observer error 
because people with different levels of expertise (e.g., 
faculty, students, collection managers, and amateur collectors, often 
not equipped with GPS) may be involved in collecting 
and identifying specimens. Moreover, the trading port rather than 
the actual location where the species was found may be given 
as the locality. Another potential problem is that the historic 
presence records extracted from museum collections often 
show poor temporal correspondence with environmental 
variables like current land-cover classification (Anderson & 
Martinez-Meyer, 2004; Gaubert et al., 2006). However, despite 
all these potential pitfalls that may affect the accuracy of 
species distribution models, use of such data is often justified 
by the lack of more systematic survey data and widespread 
demand for mapped predictions (Elith & Leathwick, 2009). 
Because of their relatively low costs, presence-only data 
remain the major source of occurrence data for species 
with large area requirements such as many carnivore species 
(Krishtalka & Humphrey, 2000). 

Fig. 1. From observation to conservation: Steps involved in species distribution modelling. 

Presence-absence data comprise additional information on 
reliable absence locations. Such information is difficult to obtain because a site 
may be used by the species without this being noticed. Therefore, the effort to collect 
presence-absence data is usually larger than that of collecting 
presence-only data. It often involves several repeated 
censuses at the same locations and may require, in addition 
to the SDM, calculation of the detection probability of the 
species in order to avoid the management consequences of 
false negative observation rates in presence/absence surveys 
(Wintle et al., 2004). Presence-absence data are therefore 
usually collected using a form of stratified random sampling 
(e.g., McAlpine et al., 2006; Rhodes et al., 2006) or follow 
a systematic sampling scheme (e.g., Fernandez et al., 2006; 
Gormley et al., 2011). Sites, usually in the form of equally 
sized grid cells or administrative units (e.g., Jhala et al., 
2011), are selected from the study area and the presence or 
absence of the target species is then determined using the 
surveys. Field survey methods may include sign surveys as 
explained in presence-only data collection and camera traps. 
However, we would like to make the point that, for almost 
all tropical carnivore species, true absence cannot be 
concluded because of the difficulties mentioned above. 
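
The value of repeated censuses can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that each visit detects a species that is present independently and with a constant per-visit detection probability; the function names are hypothetical and the calculation is not taken from Wintle et al. (2004).

```python
import math


def false_absence_probability(p_detect, n_visits):
    """Probability of never detecting a species that is actually present.

    Assumes independent visits with a constant per-visit detection
    probability (an illustrative simplification).
    """
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must lie in [0, 1]")
    if n_visits < 0:
        raise ValueError("n_visits must be non-negative")
    return (1.0 - p_detect) ** n_visits


def visits_needed(p_detect, max_false_absence):
    """Smallest number of visits for which the false-absence probability
    does not exceed max_false_absence."""
    if not 0.0 < p_detect < 1.0:
        raise ValueError("p_detect must lie strictly between 0 and 1")
    if not 0.0 < max_false_absence < 1.0:
        raise ValueError("max_false_absence must lie strictly between 0 and 1")
    return math.ceil(math.log(max_false_absence) / math.log(1.0 - p_detect))
```

Under these assumptions, a species detected on only one visit in five would require many visits before a non-detection could reasonably be treated as an absence, which illustrates why true absences are so hard to establish for elusive carnivores.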


II: ENVIRONMENTAL DATA 

This section discusses the types of environmental data that 
are suitable for species distribution modelling and reviews 
data sources (Tables 1, 2). The most common environmental 
data that are used as predictor variables in statistical SDMs 
are related to land cover types, topography, and climate 
variables. However, remote sensing data that are not related 
to land cover classification are increasingly used for SDM. 

Environmental variables may comprise either continuous data 
(data that can take any value within a certain range, such as 
elevation or slope) or categorical data (data that are split into 
discrete categories, such as land cover types). Because one 
of the aims of SDMs is to project the model as a map over the 
entire study area and because several SDM methods require 
“background data”, the values of an explanatory variable 
should be available for all grid cells in the entire study area. 
If this is not the case one may create a continuous surface 
using interpolation methods such as kriging (e.g., kriging 
interpolation using the Geostatistical or Spatial Analyst tools in 
ArcGIS) or smoothing (e.g., spline and radial basis function 
smoothing methods in the R mgcv, gstat and geoR libraries; 
e.g., Horning et al., 2010). 
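
The idea of filling every grid cell from scattered measurements can be sketched in a few lines. The example below uses inverse distance weighting, a simpler technique than the kriging and spline methods named above, and is only meant to show the principle; the function names, the sample format, and the cell-centre convention are assumptions made for this illustration.

```python
import math


def idw_interpolate(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at location (x, y).

    samples: list of (sx, sy, value) tuples for measured locations.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return float(value)  # exact hit on a measured location
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


def fill_grid(samples, n_rows, n_cols, cell_size=1.0):
    """Estimate a value at the centre of every cell of an n_rows x n_cols grid."""
    return [
        [
            idw_interpolate(samples, (col + 0.5) * cell_size, (row + 0.5) * cell_size)
            for col in range(n_cols)
        ]
        for row in range(n_rows)
    ]
```

For real analyses, the geostatistical tools cited in the text are preferable because they also model spatial structure and provide uncertainty estimates.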


Land cover variables. — Remote sensing (satellite) 
images have often been used indirectly through land cover 
classification for modelling species distributions (e.g., Schadt 
et al., 2002a; Pearson et al., 2004; see Table 1 for available 
sources of remote sensing data). Land cover classification can 
be derived from satellite images by applying unsupervised 
or supervised classification methods or a hybrid method 
which combines both (Richards, 1993; Bouman & Shapiro, 
1994; Ehsani & Quiel, 2010; see supplementary material on 
classification steps). However, some studies directly used 
seasonal NDVI (see below) composite images and clustered 
NDVI data derived from time-series NOAA AVHRR imagery 
as surrogates for land cover maps (Egbert et al., 2002). For 
example, Vaniscotte et al. (2009) and Lahoz-Monfort et al. 
(2010) used several spectral bands provided by satellite 
images as predictor variables in SDMs, and Wiegand et al. 
(2008) used the seasonal pattern of NDVI as an indicator 
for brown bear habitat quality. Egbert et al. (2002) found 
that direct NDVI data performed as well as or better than 
topographic land cover data. However, when climate data 
were added to the model, land cover data performed slightly 
better. 

Land cover types are categorical variables which cannot 
be used directly in several modelling algorithms such as 
BIOCLIM, DOMAIN, ecological niche factor analysis 
(ENFA), and Mahalanobis (see below). Hence, it is useful 
to transform them into a set of quantitative neighbourhood 
variables that yield the proportion of a given land cover type 
within distance r (e.g., De Angelo et al., 2011; Kanagaraj 
et al., 2011a; see below). Based on land cover one can also 
calculate several fragmentation metrics such as the number 
of forest patches, mean patch size, the Euclidean nearest 
neighbour distance, forest patch density or largest patch 
index, using software such as FRAGSTATS (Jaeger, 2000; 
McGarigal et al., 2002). 

Topographic and other variables. — A digital elevation 
model (DEM), which can be obtained from the SRTM 
program (see Table 2), is often processed in a GIS software 
to generate a number of topography related variables such 
as elevation, slope, aspect, surface area, surface ratio index, 
topographic wetness index, vertical distance from the channel 
network or hydrological information such as watersheds and 
water flow direction (e.g., Jenness, 2004; Hengl et al., 2009; 
Vaniscotte et al., 2009; De Angelo et al., 2011; Kanagaraj 
et al., 2011a,b). However, since elevation itself can be a 
proxy for climatic conditions, it should rather be replaced 
by climatic predictors in models that use both elevation and 
climate variables (see also the section on multicollinearity 
of predictors “Problems encountered: spatial autocorrelation 
and multicollinearity”). 

Vegetation metrics from multi-spectral satellite imagery, 
such as the normalised difference vegetation index (NDVI) 
(Tucker, 1979) and tasseled cap transformation (TCT) 
metrics (known as greenness, wetness, and brightness), have 
been employed in previous carnivore species-environment 
modelling studies (Mace et al., 1999; Carroll et al., 2001; 
Alexander et al., 2006; Wiegand et al., 2008; Vaniscotte 
et al., 2009). The NDVI is the most commonly used 
remotely sensed variable in SDMs and is available for many 
years for most areas. It is based on the normalised ratio 
of the reflectance in two spectral bands, near-infrared 
and visible red, and quantifies the difference between 
absorption in the visible range, which is related to photosynthetic activity, 
and reflectance in the near-infrared, which is related to 
electromagnetic emission by plants. NDVI is therefore correlated 
with vegetation biomass and has been used for quantifying 
productivity and above-ground biomass of ecosystems that 
influence species living in seasonal environments, local 
abundance of individuals and species distribution (Brown, 
1988; Oindo & Skidmore, 2002; Seto et al., 2004; Wiegand 
et al., 2008; Vaniscotte et al., 2009). 
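
The normalised ratio described above corresponds to the standard NDVI formula, (NIR - Red) / (NIR + Red). The following sketch applies it to single reflectance values and to small rasters stored as nested lists; the function names and the zero-signal convention are choices made for this illustration.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here for cells without any signal
    return (nir - red) / total


def ndvi_raster(nir_band, red_band):
    """Cell-by-cell NDVI for two rasters of equal shape (lists of rows)."""
    return [
        [ndvi(nir, red) for nir, red in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

For non-negative reflectances the index is bounded between -1 and 1, with dense green vegetation giving values towards the upper end.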

An integrated normalised vegetation index (INDVI) can be 
calculated as an index of vegetation productivity from the 
NDVI, if the field survey has been conducted for several 
months or seasons (e.g., Pettorelli et al., 2006; Singh & 
Milner-Gulland, 2011). Alternatively, an enhanced vegetation 
index (EVI) can be computed from the satellite imagery, 
because the NDVI suffers from saturation when vegetative biomass 
is high and is sensitive to the canopy background in open 
forested areas (Huete et al., 2002; Vaniscotte et al., 2009). 
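
Both indices can be sketched briefly. The EVI coefficients used below (gain 2.5, aerosol terms 6 and 7.5, canopy background adjustment 1) are those of the widely used MODIS formulation and are not given in the text above; summing only non-negative NDVI composites for the integrated index is likewise an assumption of this sketch, and definitions vary between studies.

```python
def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, background=1.0):
    """Enhanced vegetation index from surface reflectances.

    Default coefficients follow the common MODIS formulation (assumed here).
    """
    denominator = nir + c1 * red - c2 * blue + background
    if denominator == 0:
        raise ValueError("denominator is zero for these reflectances")
    return gain * (nir - red) / denominator


def integrated_ndvi(ndvi_series):
    """Sum of the non-negative NDVI composites over the study period.

    ndvi_series: NDVI values of one cell, one per composite (e.g., per month).
    """
    return sum(max(value, 0.0) for value in ndvi_series)
```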

Bioclimatic variables (e.g., variables related to temperature 
and precipitation) can be derived from the WorldClim 
database (Hijmans et al., 2005). Global climate models have 
been used to generate scenarios of future climates and to 
simulate climatic conditions since the end of the last glacial 
period. Bioclimatic models are used to predict geographic 
ranges of organisms as a function of climate, and are widely 
used to forecast range shifts of organisms due to climate 
change, predict the eventual ranges of invasive species, or 
to infer paleoclimate from data on species occurrences (see 
Jeschke & Strayer, 2008 for an extensive review). 

The environmental variables discussed so far were mostly 
related to factors characterising the natural environment. 
However, for carnivores, human-induced mortality is an 
important factor which often does not correlate with features 
of the natural environment (Woodroffe & Ginsberg, 1998; 
Naves et al., 2003) and may therefore create attractive 
sinks (Delibes et al., 2001). Therefore, it is important to use 
environmental variables associated with human activities that 
have direct impact on carnivore distribution, abundance, and 
survival, such as intensities of hunting/poaching, livestock 
grazing and food, fodder, and fuel wood collection. In the 
absence of such measures, proxy variables that quantify 
the potential human disturbance such as human population 
density and presence of roads can be used in the analysis. 
Disturbance variables such as digital vector layers of 
transportation and population densities or other thematic 
layers such as drainage systems and boundaries can be 
obtained at a scale of 1:1,000,000 from the Global Map 
Data project and Digital Chart of the World Data Server 
(Table 2). Two different measures can be calculated from 
these vector layers using a GIS or modelling software (e.g., 
“Spatial Analyst” in ArcGIS and “Circular Analyst” in open 
source Biomapper): 1) variables calculating the straight line 
distance to the closest target location (e.g., town, road, river) 
in the target layer; and 2) variables calculating the frequency 
of cells (grids) occupied by the target locations in a circle 
of radius r (see section Neighbourhood variables) around the 
focal cell in the target layer. 
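
The first of these two measures can be sketched as follows. The example assumes that the vector layer has already been rasterised to a binary grid in which cells containing the target feature are marked with 1; the function name and the brute-force search are illustrative choices and do not reflect how any particular GIS computes the distance.

```python
import math


def distance_raster(target, cell_size=1.0):
    """Straight-line distance from each cell centre to the nearest target cell.

    target: list of rows; truthy values mark cells with the feature
    (e.g., a road). Distances are returned in the units of cell_size.
    """
    feature_cells = [
        (r, c)
        for r, row in enumerate(target)
        for c, is_target in enumerate(row)
        if is_target
    ]
    if not feature_cells:
        raise ValueError("the target layer contains no feature cells")
    return [
        [
            min(math.hypot(r - fr, c - fc) for fr, fc in feature_cells) * cell_size
            for c in range(len(row))
        ]
        for r, row in enumerate(target)
    ]
```

The brute-force search is adequate for small grids; dedicated GIS tools use faster distance transforms for large rasters.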

Neighbourhood variables. — The grain of environmental 
variables is often small and is not necessarily related to 
the spatial scales at which the target (carnivore) species 
perceives the landscape and at which resources need to be 
available (Schadt et al., 2002b; Naves et al., 2003; Wiegand 
et al., 2008; Kanagaraj et al., 2011a). Hence, it is often 
useful to transform the original categorical land cover (or 
other environmental) variables into a set of neighbourhood 
variables. A neighbourhood variable is the mean value of 
the target variable within a specified neighbourhood radius 
around the target cell. Because the critical scale at which the 
species perceives its environment is often not known a priori, 
variables need to be constructed for several neighbourhood 
radii. They should cover spatial scales larger than the home 
range size of the target species (Schadt et al., 2002b; Naves 
et al., 2003; Wiegand et al., 2008; De Angelo et al., 2011; 
Kanagaraj et al., 2011a). 
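
A neighbourhood variable for a categorical land cover map can be sketched as the proportion of cells of one class within a circular window, computed for several radii. The code below is a minimal illustration; the function names, the representation of the map as nested lists, and the handling of map edges (only cells inside the map are counted) are assumptions.

```python
import math


def neighbourhood_proportion(land_cover, target_class, radius_cells):
    """Proportion of cells of target_class within radius_cells of each cell."""
    n_rows, n_cols = len(land_cover), len(land_cover[0])
    reach = int(math.floor(radius_cells))
    result = []
    for row in range(n_rows):
        out_row = []
        for col in range(n_cols):
            hits = 0
            total = 0
            for r in range(max(0, row - reach), min(n_rows, row + reach + 1)):
                for c in range(max(0, col - reach), min(n_cols, col + reach + 1)):
                    if math.hypot(r - row, c - col) <= radius_cells:
                        total += 1
                        if land_cover[r][c] == target_class:
                            hits += 1
            out_row.append(hits / total)  # total >= 1: the focal cell itself
        result.append(out_row)
    return result


def multi_scale_neighbourhood(land_cover, target_class, radii):
    """One neighbourhood layer per radius, keyed by the radius (in cells)."""
    return {
        radius: neighbourhood_proportion(land_cover, target_class, radius)
        for radius in radii
    }
```

Each resulting layer is a continuous variable, so it can also be used in algorithms that do not accept categorical predictors.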

III: STEPS INVOLVED IN SDM 

Species and environmental data used for modelling are 
usually stored in a Geographic Information System (GIS). 
Species data, i.e., sites where a species has been observed 
(or not observed), are usually stored as point localities 
(termed point vector data). Environmental variables are 
stored either as point vector data (e.g., percentages of trees 
or grass cover measured at sites where the species has 
been observed), as polygon layer defining an area (termed 
polygon vector data; e.g., areas with different soil types) 
or as a grid of cells (termed raster data; e.g., elevation or 
land cover types derived from remote sensing). For use in 
a species distribution model, it is common to reformat all 
environmental data to a raster grid. 

The cells containing the species data and the environmental 
data are used to build the SDM. After the statistical model is 
constructed, it is evaluated using independent species data, 
cross-validation or other evaluation measures (reviewed in 
Franklin, 2009 and Elith & Leathwick, 2009). Then the 
occurrence of species in the entire study area is predicted 
with the mathematical formula given by the statistical model 
using the environmental variables identified by the final model 
and mapped by returning probability values of occurrence 
or habitat suitability for each raster cell (Fig. 1). 
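
The evaluation step can be sketched with two small helpers. The area under the ROC curve (AUC) is used here as one example of an evaluation measure; it is not named in the text above but is widely used for SDMs. The function names and the unshuffled fold assignment are assumptions of this sketch.

```python
def auc(presence_scores, absence_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen presence site receives a
    higher predicted score than a randomly chosen absence site (ties count 0.5).
    """
    if not presence_scores or not absence_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def k_fold_indices(n_records, k):
    """Assign record indices 0..n_records-1 to k folds of near-equal size.

    Records are assigned in order without shuffling; each fold in turn can
    be withheld for evaluation while the model is fitted to the others.
    """
    if k < 1 or k > n_records:
        raise ValueError("k must be between 1 and n_records")
    return [list(range(start, n_records, k)) for start in range(k)]
```

An AUC of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to perfect discrimination between presence and absence sites.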

Modelling algorithms. —A number of alternative modelling 
algorithms have been developed to classify the probability of 
species’ presence (and absence; or abundance) as a function 
of a set of environmental variables. The quality of species 
distribution models depends on the quality, grain and extent 
of the data (e.g., Trivedi et al., 2008; von dem Bussche et 
al., 2008), and on the effectiveness of the statistical model. 
The differences between model algorithms in dealing with 
several technical issues should be considered when selecting 
a method to apply. Major issues are: 

1. The ability of different algorithms to deal with the 
different species data types 

2. The ability of different algorithms to deal with categorical 
environmental variables (e.g., ENFA does not work with 
categorical variables) 

3. Complex non-linear responses of species in 
multidimensional environmental space 

4. Presence of spatial and temporal autocorrelation 
(Legendre, 1993; Barry & Elith, 2006; Dormann et al., 
2007) 

5. Non-equilibrium distribution of species (e.g., population 
extension or reduction) 

6. Non-stationarity, i.e., the variation in modelled 
relationships over space and/or time (Osborne et 
al., 2007), as the model assumes that the effects of 
environmental variables are fixed and universal (e.g., 
Hothorn et al., 2011) 

In the following we will discuss how the different algorithms 
of SDM deal with these issues. 

Presence-only methods. — There are several approaches 
that use presence-only data for species distribution modelling 
(see Table 3 for the list and abbreviations). Models used 
to fit presence-only data predict the relative likelihood of 
species presence at a site, or the relative habitat suitability, 
but not the actual probability of species presence which 
can only be estimated by presence-absence methods (Elith 
& Leathwick, 2009; Franklin, 2009). The presence-only 
methods BIOCLIM, DOMAIN, LIVES, and Mahalanobis 
use only presence records, whereas all other methods require 
presence and some form of absence from a background 
(Table 3). They either compare use with availability or are 
based on “pseudo-absences” to apply methods of presence- 
absence modelling. In short, we can distinguish three types 
of presence-only methods: 

1. Methods that rely solely on presence (occurrence) 
data (e.g. BIOCLIM, DOMAIN and LIVES): 

Envelope techniques (e.g., BIOCLIM, DOMAIN, 
LIVES, and Mahalanobis) fit a minimal envelope in a 
multidimensional space to presence-only data and 
therefore do not require background data. 

2. Methods that require presence and a background 
sample of absence data (e.g., ENFA or Maxent): In 
most techniques, the recursive selection of an absence 
sample from the background is automatically performed 
in the modelling software (e.g. Maxent, GARP, and 
Biomapper (ENFA); see Table 3 for references where 
these methods have been applied in carnivore habitat 
modelling studies). 

3. Methods that require presence data and 
pseudo-absences from the study area: Pseudo-absences can 
be generated in several ways. They can be selected: 
A) at random from the study area (e.g., Stockwell & 
Peters, 1999; Hirzel et al., 2002; Rosalino et al., 2010); 

B) in two steps, first generating an SDM with a group 
discriminative technique and random pseudo-absences, 
and then obtaining pseudo-absences only from the areas 
predicted to have lower suitability values (Zaniewski et 
al., 2002); C) based on a weighting criterion (e.g., Engler 
et al., 2004; Chefaoui & Lobo, 2007, 2008; Acevedo 
& Cassinello, 2009; Kanagaraj et al., 2011a); and D) 
using distribution data from species (called auxiliary 
species) with similar environmental requirements to 
the species being studied (Lütolf et al., 2006). Another 
effective way to generate the pseudo-absences is based 
on a strategy called target-group absences (Mateo et 
al., 2010a). Important differences between the 
pseudo-absence approach and the background approach are 
that the pseudo-absence methods do not include 
occurrence localities within the set of pseudo-absences 
and pseudo-absences can be selected in a GIS and the 
number and the location criteria can be controlled by the 
researcher. In principle, any presence-absence method 
can be implemented using pseudo-absences (e.g., ANN, 
BIOMOD, BRT, DCM, GLM, GAM, and MARS; see 
Table 3). 
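
Two of the ideas in the list above can be sketched in simplified form: a rectilinear envelope fitted to presence records only, in the spirit of envelope techniques such as BIOCLIM (which in practice typically uses percentile limits and is not reproduced here), and the random selection of pseudo-absences that excludes occurrence localities (option A). All function names and data layouts are assumptions made for this illustration.

```python
import random


def fit_envelope(presence_values):
    """Minimum and maximum of each environmental variable at presence sites.

    presence_values: list of equal-length tuples, one tuple per presence record.
    """
    if not presence_values:
        raise ValueError("at least one presence record is required")
    n_vars = len(presence_values[0])
    return [
        (min(rec[i] for rec in presence_values), max(rec[i] for rec in presence_values))
        for i in range(n_vars)
    ]


def inside_envelope(envelope, values):
    """True if every variable lies within the fitted range."""
    return all(low <= v <= high for (low, high), v in zip(envelope, values))


def random_pseudo_absences(all_cells, presence_cells, n, seed=0):
    """Draw n distinct cells at random from the study area, excluding presence cells."""
    occupied = set(presence_cells)
    candidates = [cell for cell in all_cells if cell not in occupied]
    if n > len(candidates):
        raise ValueError("not enough non-presence cells in the study area")
    return random.Random(seed).sample(candidates, n)
```

Because the researcher controls the number of pseudo-absences and the candidate set, the weighting and two-step strategies (options B to D) can be implemented by changing how `candidates` is built.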

Regarding their performance, a study by Elith et al. (2006) 
identified new tools such as MARS and Maxent to be better 
suited than the well-established and more widely used 
modelling methods such as DOMAIN, GARP, and BIOCLIM. 
Among the presence-only and presence-absence methods, 
Maxent has become a popular method that has performed 
generally well (Elith et al., 2006; Phillips et al., 2006, 2009; 
Guisan et al., 2007; Hernandez et al., 2008), especially when 
only a small sample of observations is available (Hernandez 
et al., 2006; Pearson et al., 2007; Wisz et al., 2008; Franklin, 
2009). Also, the Maxent model algorithm considers linear, 
non-linear, and interaction effects (Phillips et al., 2006). A 
recent study by Hoffman et al. (2010) showed that there 
was little difference in the performance of presence-absence 
models (GAM, GLM or logistic regression) compared to the 
presence-only models, and that relatively accurate models 
can be generated using Maxent and/or discrete choice models 
(DCM). They noted that DCMs are conceptually similar to 
other resource selection functions (RSFs) such as logistic 
regression where data are collected from sites where the 
species is present and absent (Manly et al., 2002; Keating 
& Cherry, 2004). However, unlike logistic regression in 
RSFs, absence sites in DCMs are selected at random and 
confined to a “choice set”, which can be controlled by the 
researcher. Often, these presence-only methods have been 
implemented in user-friendly software that is free and easy 
to obtain (Table 3). 

Table 3. Approaches for modelling species resource selection, habitat suitability and distribution. Information compiled mostly from Franklin (2009) and Elith & Leathwick (2009). 

Presence-absence methods. — Presence/absence data can be 
used to predict the actual probability of species occurrence. 
Example methods include regression methods such as 
GLM, GAM, BRT, and MARS, and DT and ANN (also 
the approaches mentioned in type 3 presence-only methods; 
Table 3; see Franklin (2009) for more details). Generalised 
linear models (GLMs) and generalised additive models 
(GAMs) have been used extensively in species distribution 
modelling, including carnivore habitat modelling studies 
(e.g., Mladenoff et al., 1995; Palma et al., 1999; Schadt et 
al., 2002b; Woolf et al., 2002; Naves et al., 2003; Hoving 
et al., 2004; Muntifering et al., 2006; Varela et al., 2009; 
Conde et al., 2010; Rosalino et al., 2010; Zielinski et al., 
2010; Kanagaraj et al., 2011a). They are the preferred tool 
because of their strong statistical foundation and ability to 
realistically model ecological relationships and produce 
robust models (Austin, 2002), and they performed better than 
classification trees and GARP (Meynard & Quinn, 2007). 
GLMs fit parametric terms, usually some combination of 
linear, quadratic, and/or cubic terms, whereas GAMs use 
non-parametric, data-defined smoothers to fit non-linear 
functions and are more capable of modelling complex 
ecological response shapes than GLMs (Yee & Mitchell, 
1991; Elith et al., 2006). 

Models such as autoregressive models (AR), generalised 
linear mixed-effect models (GLMM; Rhodes et al., 2009; 
Klar et al., 2008), generalised estimating equations (GEE), 
and spatial filtering can handle spatially autocorrelated species 
data (see Table 6.2 in Franklin, 2009 for details). However, 
a new modelling framework described by Hothorn et al. 
(2011) handles non-linearity, interaction, spatiotemporal 
autocorrelations, and non-stationarity in a single non- 
parametric modelling framework (model-based boosting). 
Presence-absence and abundance data can be used in this 
framework. 

When telemetry data are available, one can estimate resource 
selection functions (RSFs), which compare used (telemetry 
locations) with available habitat (Manly et al., 2002). RSF 
models can also be developed for sign survey data when the 
sampling protocol represents a true presence-absence design 
(see Alexander et al., 2006; Carroll & Miquelle, 2006). 
These RSF models have been used in several carnivore 
habitat modelling studies (e.g., Apps et al., 2004; Carroll & 
Miquelle, 2006; Klar et al., 2008; Rozylowicz et al., 2010). 

Occupancy modelling. — Since the utility of predictive 
distribution models for designing reliable conservation 
planning depends on our capacity to obtain the best 
quality data possible for developing habitat models, it 
is also particularly important to explicitly account for 
imperfect detectability for studies of rare and elusive 
species (Thompson, 2004). Detection-nondetection data 
from repeated independent surveys are suitable for 
occupancy estimation and modelling (MacKenzie & Royle, 
2005); however, detection probability can also be estimated 
directly from single surveys of multiple trail segments 
following newly developed sign survey protocols (Thorn, 
2009; Hines et al., 2010). Likelihood-based occupancy 
modelling (MacKenzie et al., 2002; MacKenzie & Bailey, 
2004; MacKenzie et al., 2006) permits the simultaneous 
estimation of site occupancy and detectability and allows 
predictive probability-of-occurrence models to be produced from 
detection-nondetection data of wide-ranging carnivores 
(e.g., Gardner et al., 2010; Long et al., 2011; Sollmann et 
al., 2011; Sunarto et al., 2012). 
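
The likelihood behind single-season occupancy models can be sketched in a few lines. The sketch below assumes a constant occupancy probability (psi) and a constant detection probability (p) across all sites and surveys; the function names and detection histories are hypothetical, and the coarse grid search merely stands in for a proper optimiser or dedicated occupancy software.

```python
import math

def site_likelihood(history, psi, p):
    """Likelihood of one site's detection history (list of 0/1, one per survey)."""
    detections = sum(history)
    misses = len(history) - detections
    # Probability of this history if the site is occupied.
    occupied = psi * (p ** detections) * ((1 - p) ** misses)
    if detections > 0:
        return occupied
    # Never detected: occupied but missed every time, or truly unoccupied.
    return occupied + (1 - psi)

def log_likelihood(histories, psi, p):
    return sum(math.log(site_likelihood(h, psi, p)) for h in histories)

def fit_grid(histories, steps=99):
    """Coarse grid search over (psi, p) in (0, 1); for illustration only."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(((psi, p) for psi in grid for p in grid),
               key=lambda pair: log_likelihood(histories, *pair))

# Hypothetical detection histories: 6 sites, 3 surveys each.
histories = [[1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0], [1, 1, 0], [0, 0, 0]]
naive = sum(1 for h in histories if any(h)) / len(histories)
psi_hat, p_hat = fit_grid(histories)
print(f"naive occupancy={naive:.2f}, estimated psi={psi_hat:.2f}, p={p_hat:.2f}")
```

Because sites without detections may still be occupied, the estimated occupancy in this example is higher than the naive proportion of sites with at least one detection.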


Ensemble forecasting. — According to Araujo & New (2007), 
a forecast ensemble can be defined as “multiple simulations 
(copies) across more than one set of initial conditions 
(IC), model classes (MC), parameters (MP), and boundary 
conditions (BC)”. Although this method has been widely used 
in a variety of other fields of research such as economics, 
meteorology, climatology, etc., it has only been recently 
applied in ecological studies for bioclimatic modelling of 
species distributions (see Araujo & New, 2007 and references 
therein). Though several SDM modelling techniques such 
as ANNs, GARP, MAXENT, RFs, etc., incorporate the 
notion of this method (see Table 1 in Araujo & New, 2007), 
these techniques do not consider all possible combinations 
of IC, MC, MP, and BC, providing an unclear picture of 
the potential model uncertainties (Araujo & New, 2007). 
Thuiller et al. (2009) developed a software platform called 
‘BIOMOD’ for ensemble forecasting methods that enables 
the treatment of a range of methodological uncertainties in 
models and allows users to test several modelling techniques, 
project species distributions into different climate or land 
use change scenarios and dispersal functions. 
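
One simple way to combine several model outputs is an unweighted consensus with a per-cell measure of disagreement. The sketch below is only an illustration of that idea and does not reproduce BIOMOD or the full ensemble framework of Araujo & New (2007); the model names and predicted probabilities are hypothetical.

```python
import statistics

def ensemble(predictions):
    """Combine per-cell predictions from several models.

    predictions: dict mapping model name -> list of predicted probabilities,
    one value per grid cell, in the same cell order for every model.
    Returns (consensus mean per cell, standard deviation across models per cell).
    """
    per_cell = list(zip(*predictions.values()))
    mean = [statistics.fmean(cell) for cell in per_cell]
    spread = [statistics.pstdev(cell) for cell in per_cell]
    return mean, spread

# Hypothetical predictions for two grid cells from three techniques.
predictions = {
    "glm": [0.2, 0.8],
    "gam": [0.4, 0.8],
    "maxent": [0.6, 0.8],
}
consensus, disagreement = ensemble(predictions)
print(consensus, disagreement)
```

Cells with a large spread are those where the choice of modelling technique matters most, which is one aspect of the uncertainty that ensemble approaches aim to expose.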

Problems encountered: Spatial autocorrelation and 
multicollinearity. — Spatial autocorrelation describes the 
phenomenon that species occurrences that are close in space 
are more similar to each other than the occurrences that are 
further away from each other. This lack of independence in the 
data can lead to pseudo-replication, which inflates the apparent 
significance of variables. Serial autocorrelation, especially in 
GPS telemetry data, can lead to the inaccurate estimation 
of RSFs (Koper & Manseau, 2009). These problems may 
be solved by achieving independence in the data set by 
destructive sampling (Way et al., 2004), but this may require 
dropping as many as 95% of the data collected (Saher, 2005; 
Koper & Manseau, 2009). One way to assess the extent of 
spatial autocorrelation in the presence/absence data is to look 
at correlograms of the data and of the residuals (Cliff & Ord, 
1981; Bjornstad & Falck, 2001; Dormann et al., 2007). A 
spline correlogram of the raw (presence/absence) data and 
the residuals of the regression model can be produced to 
investigate the spatial autocorrelation (e.g., Rhodes et al., 
2009; R package ‘ncf’). Dormann et al. (2007) distinguish 
four approaches to address spatial autocorrelation in linear 
models: autocovariate models, spatial eigenvector mapping, 
generalised least squares, and generalised estimating 
equations. The modelling framework described by Hothorn 
et al. (2011) handles spatial and temporal autocorrelation in 
presence-absence and abundance data sets. 
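
A correlogram is built from an autocorrelation statistic computed per distance class. As a minimal sketch, Moran's I for a single distance class can be computed as follows, assuming binary neighbourhood weights; the function name, coordinates and values are hypothetical, and this does not reproduce the spline correlogram of the R package 'ncf'.

```python
import math

def morans_i(coords, values, max_distance):
    """Moran's I with binary weights: w_ij = 1 if 0 < distance <= max_distance."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_sum = 0.0
    cross_product = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= max_distance:
                weight_sum += 1.0
                cross_product += deviations[i] * deviations[j]
    variance_sum = sum(d * d for d in deviations)
    return (n / weight_sum) * (cross_product / variance_sum)

# Six hypothetical sites along a transect, one distance unit apart.
coords = [(float(i), 0.0) for i in range(6)]
clustered = [1, 1, 1, 0, 0, 0]    # presences aggregated in space
alternating = [1, 0, 1, 0, 1, 0]  # presences regularly dispersed
print(morans_i(coords, clustered, 1.0), morans_i(coords, alternating, 1.0))
```

Applying the same function to model residuals instead of raw presence/absence values indicates whether the fitted model has absorbed the spatial structure; values near zero suggest little remaining autocorrelation at that distance.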

Spatial autocorrelation in the presence (presence-only) 
localities can also be minimised through the use of a 
constrained random split of sampled locations forcing all pairs 
of points below a threshold distance to split dichotomously 
into the training and the test sets (Parolo et al., 2008). 
Another option is to overlay a grid with a cell size equal to 
the home-range of the target (carnivore) species and randomly 
select a single point from each cell that contains more than 
one record (Sattler et al., 2007; De Angelo et al., 2011; 
Kanagaraj et al., 2011a). Alternatively, the background area 
can also be manipulated (e.g., Kramer-Schadt et al., 2013). 
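
The grid-based thinning described above can be sketched as follows, assuming projected coordinates (e.g., metres) and a cell size chosen by the analyst to match the home range of the target species. The function name, the cell size, and the records are hypothetical.

```python
import random

def thin_by_grid(points, cell_size, seed=0):
    """Keep one randomly chosen record per grid cell.

    points: list of (x, y) coordinates in projected units.
    cell_size: edge length of a grid cell in the same units.
    """
    rng = random.Random(seed)
    cells = {}
    for x, y in points:
        key = (int(x // cell_size), int(y // cell_size))
        cells.setdefault(key, []).append((x, y))
    # One record per occupied cell; cells with a single record keep it.
    return [rng.choice(records) for _, records in sorted(cells.items())]

# Hypothetical occurrence records; two pairs fall into the same 1 km cell.
records = [(120.0, 80.0), (150.0, 95.0), (2300.0, 400.0),
           (2350.0, 410.0), (5100.0, 5100.0)]
thinned = thin_by_grid(records, cell_size=1000.0)
print(len(records), "->", len(thinned))
```

Fixing the random seed keeps the thinned data set reproducible; repeating the thinning with different seeds is a simple way to check how sensitive a model is to the particular records retained.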

An important issue to be considered before fitting regression 
models is to check whether there is high collinearity 
between the explanatory variables (Graham, 2003). The 
problem with collinearity is that if one variable depends 
on the others, an ecological interpretation of the impact of a 
given variable on habitat suitability is not possible. Collinearity 
can be identified by looking at the pairwise Spearman rank 
correlation coefficient between explanatory variables. As 
a rule of thumb, correlation coefficients between pairs of 
variables with magnitude of |r| >0.7 indicate high collinearity 
(Dormann et al., 2012). Another way to check the collinearity 
is to calculate the variance inflation factors (VIFs) for each 
variable. A cut-off value of 5 or 3 can be used to remove 
collinear variables (e.g., Zuur et al., 2009, chapter 16, R 
package ‘AED’). For further methods, we recommend 
that readers consult Dormann et al. (2012), who tested several 
available collinearity diagnostics methods and evaluated their 
performance based on a simulation study. 
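
The pairwise screening step can be sketched as follows: Spearman's coefficient is the Pearson correlation of the ranks, and pairs with |r| > 0.7 are flagged. The predictor names and values are hypothetical, and the sketch covers only the pairwise check, not variance inflation factors.

```python
import math

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

def collinear_pairs(predictors, threshold=0.7):
    """Return (name, name, r) for every pair with |r| above the threshold."""
    names = sorted(predictors)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = spearman(predictors[first], predictors[second])
            if abs(r) > threshold:
                flagged.append((first, second, r))
    return flagged

# Hypothetical predictor values at six sites.
predictors = {
    "elevation": [120, 340, 560, 800, 1020, 1500],
    "temperature": [27.1, 25.8, 24.0, 22.5, 21.9, 18.2],
    "forest_cover": [0.2, 0.9, 0.4, 0.7, 0.1, 0.6],
}
flagged = collinear_pairs(predictors)
print(flagged)
```

For each flagged pair, one would typically retain the variable with the clearer ecological interpretation for the target species.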

Problems encountered: Complex non-linear response 
and interactions. — Environmental variables that may act 
nonlinearly can be identified by plotting the explanatory 
variable against the response. Non-linearity of the continuous 
predictor variable can be included in the model by adding its 
square or an nth-order polynomial. This is the simplest and generally 
sufficient way to include non-linear effects in the model 
(Dormann, 2011). Another way to avoid this problem is to 
use smoothing models such as GAM (Zuur et al., 2009). 
However, the influence of explanatory variables on species 
occurrence can also be non-additive and can be estimated 
by random forests (Cutler et al., 2007) or boosted regression 
trees (Moisen et al., 2006; De’ath, 2007; Elith et al., 2008; 
Zurell et al., 2009; Hothorn et al., 2011). 

Another potential problem in this context can be caused by 
interaction effects of explanatory variables. An interaction 
effect occurs if the effect of one variable depends on the 
level of the other variable. Thus, including only the main 
effect of the variable (i.e., the effect of each variable, 
independent of the other variables) in the model may lead 
to misinterpretation if interaction effects are present. 
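
For illustration, a logistic habitat model with one quadratic term and one interaction could be written as below. The predictors x_1 and x_2 and the coefficients are generic placeholders, not a model taken from the studies cited above.

```latex
\mathrm{logit}(p) = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \beta_3 x_2 + \beta_4 x_1 x_2,
\qquad
\frac{\partial\, \mathrm{logit}(p)}{\partial x_2} = \beta_3 + \beta_4 x_1
```

The effect of x_2 on the log-odds of occurrence thus depends on the level of x_1 whenever \beta_4 is non-zero, which is why reporting \beta_3 alone can be misleading.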

Problems encountered: Non-equilibrium distribution 
of species. — SDMs are fundamentally static in nature 
and usually require the assumption that the species is in 
equilibrium with its environment. That means that the species 
is present at locations that show a good suitability and the 
species is absent at locations that show a low suitability. 
However, this assumption is often not met. For example, 
many endangered species are showing range reduction due 
to increasing human disturbance. In this case one may adapt 
the data scheme and use only absence locations that are 
close enough to known species presences, assuming that 
the species could potentially use these sites. This then allows 
potential habitat that is currently outside the dispersal range 
of the species to be identified. However, if these areas 
were included as unsuitable (because they were unused), 
severe bias may result. 


Ironically, climate change is one of the most important 
motivations for the booming use of SDMs. However, if 
the environment changes, a species is likely not to be in 
equilibrium with its environment and the prediction accuracy 
of the species distribution will be severely affected (Zurell 
et al., 2009; Peterson et al., 2011). The spatial and temporal 
variability in the environment resulting from a changing climate 
must be accounted for in SDMs (Zurell et al., 2009). To 
correctly predict and understand range shifts of species, the 
dynamic nature of populations, including dispersal at the leading 
edge and extinction or persistence at the trailing edge of the 
range shift, should be incorporated into SDMs (Zurell 
et al., 2009). 

Model selection and validation. — In the multiple regression 
context, model subset selection in species distribution 
modelling has two distinct purposes: 1) to find the single best 
‘predictive’ model; and 2) to explain causal relationships 
between the dependent variable and the independent variables 
using an explanatory approach (Mac Nally, 2000). In the first 
approach a quantitative model is desired for ‘prediction’. In 
the second case, however, no quantitative model is desired, 
but ‘further studies and experiments may be suggested for 
testing the causal nature of relationships’ (Mac Nally, 2000). 
The predictive model can be used to make predictions of 
the current and future distribution of species based on 
measurements of a few explanatory variables. An appropriate 
selection criterion (e.g., Akaike information criterion [AIC] or 
Bayesian information criterion [BIC]) can be used to find the 
single best ‘predictive’ model. The explanatory approach can 
be used to test predictions of models, i.e., testing whether the 
outcomes of the explanatory approach are in agreement with 
the ‘predictive’ approach, and to develop new insights and 
design new research (Mac Nally, 2000). A suitable method 
for the explanatory approach to explore potentially causal/ 
explanatory relationships between the dependent variable 
and the independent variables is hierarchical partitioning 
(Mac Nally, 2000). 

Regression methods provide sophisticated tools of model 
selection that allow the researcher to identify the hypothesis 
and the single best ‘predictive’ model that receives most 
support from the data, given a predefined array of competing 
hypotheses about the environmental variables that influence 
habitat suitability (Burnham & Anderson, 1998; Johnson & 
Omland, 2004). Using the accumulated knowledge of ecology 
of the target species provides a basis for guided a priori 
selection of explanatory variables that may influence species 
occurrence (e.g., Fernandez et al., 2003; Fernandez et al., 
2006; Klar et al., 2008; Kanagaraj et al., 2011a). Information- 
theoretic methods can then be used for model selection 
where model fit is assessed using an appropriate criterion 
(e.g., AIC or BIC) that balances fit against model complexity 
(i.e., number of parameters). The most parsimonious model 
is usually selected based on the lowest AIC value (Burnham 
& Anderson, 1998). Regularisation methods for logistic 
regression, such as lasso and ridge, have also proven 
useful and risk-averse modelling strategies, especially at small 
sample sizes (Reineking & Schroder, 2006). 
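
The information-theoretic comparison can be sketched as follows, using AIC = 2k - 2 ln(L) together with AIC differences and Akaike weights. The candidate model names, log-likelihoods and parameter counts are hypothetical; in practice they would come from the fitted regression models.

```python
import math

def aic(log_likelihood, n_parameters):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood

def akaike_table(models):
    """models: dict name -> (maximised log-likelihood, number of parameters).

    Returns dict name -> (AIC, delta AIC, Akaike weight).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    best = min(scores.values())
    deltas = {name: score - best for name, score in scores.items()}
    relative = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(relative.values())
    return {name: (scores[name], deltas[name], relative[name] / total)
            for name in models}

# Hypothetical a priori hypotheses about what drives habitat suitability.
candidates = {
    "food + cover": (-112.4, 3),
    "human disturbance": (-118.9, 2),
    "food + cover + human disturbance": (-110.8, 5),
}
table = akaike_table(candidates)
best_model = min(table, key=lambda name: table[name][0])
for name, (score, delta, weight) in table.items():
    print(f"{name}: AIC={score:.1f}, delta={delta:.1f}, weight={weight:.2f}")
```

In this made-up example the full model fits slightly better but is penalised for its two extra parameters, so the simpler "food + cover" model receives the most support.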

Model diagnostics is an important step to check the quality 
of the model. It is important to investigate the model 
residuals to be confident that model assumptions are valid 
(e.g., plotting the residuals against fitted values to verify 
homogeneity, making a histogram of the residuals for 
normality and residuals against each explanatory variable to 
check independence; Zuur et al., 2009). Additionally, model 
residuals should be tested for spatial autocorrelation. Several 
methods are available to correct spatial autocorrelation in 
the occurrence data (see Dormann et al., 2007). 

Performance of the fitted model can be divided into calibration 
and discrimination (Pearce & Ferrier, 2000) and quantified 
by deriving several statistical measures by comparing 
the model predictions with field data. Both performance 
criteria should be used to evaluate the performance of 
the fitted model (Reineking & Schroder, 2006), as they 
measure different aspects of model performance (Harrell, 
2001). Calibration can be quantified using a calibration curve 
(e.g., Reineking & Schroder, 2006) and by calculating the 
explained deviance (e.g., Zuur et al., 2009). The 
threshold-independent AUC (area under the curve) is the 
most commonly used measure for evaluating 
model discrimination for presence-absence data (Fielding & 
Bell, 1997), whereas k-fold cross-validation can be used for 
presence-available models (Boyce et al., 2002). The AUC- 
value, estimated by calculating the area under a receiver 
operating characteristic (ROC) curve, is 0.5 in the random 
model case and 1 if the classification is perfect. There are 
other performance measures such as classification error 
and kappa, which are based on binary predictions. For this 
purpose, the probability predictions from the fitted model are 
converted into presences and absences using an appropriate 
threshold value (Liu et al., 2005). A confusion matrix can 
then be used to calculate the measures such as commission 
and omission error, kappa, etc. 
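
The discrimination measures above can be sketched with plain Python. AUC is computed here as the probability that a randomly chosen presence receives a higher prediction than a randomly chosen absence, and kappa is derived from the confusion matrix at a chosen threshold. The observations, predicted probabilities and the threshold of 0.5 are hypothetical.

```python
def auc(presence_scores, absence_scores):
    """Share of presence/absence pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))

def confusion_counts(observed, probabilities, threshold):
    """Return (true pos., false pos., false neg., true neg.) at a threshold."""
    tp = fp = fn = tn = 0
    for obs, prob in zip(observed, probabilities):
        predicted = prob >= threshold
        if obs and predicted:
            tp += 1
        elif obs:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed_agreement = (tp + tn) / n
    chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed_agreement - chance) / (1 - chance)

# Hypothetical test data: 1 = presence, 0 = absence.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
auc_value = auc([p for o, p in zip(observed, probabilities) if o],
                [p for o, p in zip(observed, probabilities) if not o])
tp, fp, fn, tn = confusion_counts(observed, probabilities, threshold=0.5)
omission = fn / (tp + fn)      # presences predicted as absent
commission = fp / (fp + tn)    # absences predicted as present
print(auc_value, kappa(tp, fp, fn, tn), omission, commission)
```

Unlike AUC, the confusion-matrix measures change with the threshold, so the threshold choice (Liu et al., 2005) should be reported alongside them.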


IV: APPLICATION OF SDMS IN CARNIVORE 
CONSERVATION PLANNING 

Predictions obtained from SDMs can be used in a range of 
conservation and management applications. First, in a static 
manner, single species distribution or habitat suitability maps 
can be used to evaluate the currently available area for the 
species under study. This allows basic questions to be addressed, 
such as whether the area is large enough to sustain a population, 
whether the best areas are those that bear the largest conflicts 
with human use, or which areas may be suitable for restoration. 
Additionally, the predicted habitat maps can be used to assess 
functional landscape connectivity for inter-patch dispersal 
and to identify barriers to species movement at regional 
scales (e.g., Beier et al., 2006; Kanagaraj et al., 2011a). In 
this context, the least-cost modelling approach (reviewed 
in Sawyer et al., 2011) has been widely used for designing 
wildlife corridors. Least-cost path models use a GIS raster 
and assign weights (or landscape resistance values) to 
landscape elements such as certain land cover types or roads 
that describe the cost of moving through this element. The 
least-cost algorithm then searches for the path with the lowest 
cost. Instead of assigning weights to landscape elements one 
can also use habitat suitability maps directly (Clevenger et 
al., 2002; Chetkiewicz & Boyce, 2009; Huck et al., 2010). 
The weighted input GIS maps can be generated based on 
techniques that include expert-based models (Schadt et al., 
2002a; Singleton et al., 2002; Wikramanayake et al., 2004; 
Johnson & Gillingham, 2005; Beier et al., 2006; Beier et 
al., 2009), compositional and Euclidean and Mahalanobis 
distance analyses (Clevenger et al., 2002; Kautz et al., 2006), 
RSFs (Chetkiewicz et al., 2006; Chetkiewicz & Boyce, 
2009), weights-of-evidence (Kindall & Van Manen, 2007) 
and profile methods such as ENFA (Huck et al., 2010). The 
least-cost approach yields connections between delineated areas that contain 
the least amount of barriers or unfavourable conditions 
and are therefore assumed to represent the most promising 
wildlife corridors (Sawyer et al., 2011). 
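
The least-cost search can be sketched with Dijkstra's algorithm on a small resistance raster. The sketch assumes four-neighbour movement and defines path cost as the sum of the resistance values of all cells on the path (including the start cell); GIS implementations may use other cost definitions. The raster values and function name are hypothetical.

```python
import heapq

def least_cost_path(resistance, start, goal):
    """Dijkstra search on a raster; cells are (row, column) tuples."""
    rows, cols = len(resistance), len(resistance[0])
    best = {start: resistance[start[0]][start[1]]}
    previous = {}
    queue = [(best[start], start)]
    while queue:
        cost, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # outdated queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + resistance[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    # Walk back from the goal to reconstruct the path.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]

# Hypothetical resistance raster: 1 = forest, 9 = road or settlement.
resistance = [
    [1, 1, 9, 1],
    [9, 1, 9, 1],
    [1, 1, 1, 1],
]
cost, path = least_cost_path(resistance, start=(0, 0), goal=(0, 3))
print(cost, path)
```

In this toy landscape the cheapest route detours through the bottom row rather than crossing the high-resistance column, illustrating how a corridor can be longer in distance yet lower in cost.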

Distribution or habitat suitability maps can also be used to 
identify conflict areas at the human-wildlife interface or to 
pin down areas of carnivore conservation concern. Predicted 
habitat suitability values have been used in reserve selection 
(e.g., Margules & Nicholls, 1987; Williams & Araujo, 2002; 
Zielinski et al., 2006). Algorithms such as MARXAN can be 
used to identify priority habitat areas for individual species, 
and for combined species groups, and to compare these 
areas with existing reserves in order to identify habitats that 
do not align with existing reserves (Zielinski et al., 2006). 

In biodiversity research, habitat suitability models developed 
for individual species can be used to assess intra-guild 
competition and reveal differences in habitat use patterns 
within the carnivore community (e.g., Alexander et al., 2006; 
May et al., 2008; Durant et al., 2010; De Angelo et al., 2011). 
They can be used to calculate the degree of overlap among 
species and patch sizes in order to identify the necessary 
scales for regional zoning for conservation and management 
implications (May et al., 2008; De Angelo et al., 2011). For 
example, BIOMAPPER software tools can be used for direct 
species comparison (e.g., Durant et al., 2010; Pettorelli et 
al., 2010) or to understand how ecologically similar species 
respond to anthropogenic transformations of the landscape 
(De Angelo et al., 2011) and to estimate traditional niche 
breadth (Levins’ standardised index) and overlap indices 
(Pianka’s overlap index and Lloyd’s asymmetric overlap 
index) by applying a discriminant analysis (Hirzel et al., 
2008; Qi et al., 2009; Simard et al., 2009; De Angelo et al., 
2011). Environmental favourability functions derived from 
habitat suitability models (Real et al., 2006) enable direct 
model comparison and combination when more than one 
species is involved (e.g., Estrada et al., 2008; Real et al., 
2008; Real et al., 2009), for example, directly comparing 
the degree of favourability for a rare predator and a more 
common prey (Real et al., 2009). 
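
The niche breadth and overlap indices named above have simple closed forms when resource use is summarised as counts per habitat class. The sketch below computes Levins' standardised breadth and Pianka's overlap from such counts; it does not reproduce the discriminant-analysis procedure used in BIOMAPPER, and the habitat classes and counts are hypothetical.

```python
import math

def proportions(counts):
    total = sum(counts)
    return [c / total for c in counts]

def levins_standardised(counts):
    """Levins' breadth B = 1 / sum(p_i^2), rescaled to 0..1 as (B - 1) / (n - 1)."""
    p = proportions(counts)
    breadth = 1 / sum(x * x for x in p)
    return (breadth - 1) / (len(p) - 1)

def pianka_overlap(counts_a, counts_b):
    """Pianka's symmetric overlap: sum(p_i q_i) / sqrt(sum(p_i^2) sum(q_i^2))."""
    p, q = proportions(counts_a), proportions(counts_b)
    numerator = sum(x * y for x, y in zip(p, q))
    return numerator / math.sqrt(sum(x * x for x in p) * sum(y * y for y in q))

# Hypothetical records per habitat class (forest, edge, plantation, open).
species_a = [40, 10, 5, 5]
species_b = [10, 20, 20, 10]
print(levins_standardised(species_a), levins_standardised(species_b))
print(pianka_overlap(species_a, species_b))
```

A standardised breadth of 0 indicates use of a single class and 1 indicates equal use of all classes; an overlap of 1 indicates identical proportional use by both species.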

When different factors determine carnivore mortality (often 
human disturbances) and reproduction (often natural habitat 
factors; Woodroffe & Ginsberg, 1998; Naves et al., 2003), 
a two-dimensional habitat model can be developed where 
one axis describes suitability for reproduction and the 
second axis survival. Ideally, each axis would be constructed 
from data about reproduction and mortality (e.g., Nielsen 
et al., 2006; Falcucci et al., 2009). However, since such 
information is often not available, Naves et al. (2003) 
proposed to use species presence-absence data, but with 
different hypotheses on the environmental variables where 
the reproduction model would be based on variables related 
to food resources and cover, and the survival model would 
be based on variables related to human disturbances. This 
allows a more sophisticated look at habitat suitability than 
possible with traditional one-dimensional approaches that 
rank habitat suitability from unsuitable matrix to poor and 
to good. A two-dimensional habitat model can categorise a 
landscape into demographically motivated categories, which 
leads to the identification of critical areas for management 
such as attractive sinks (i.e., good natural suitability but high 
levels of human disturbance; Naves et al., 2003; Kanagaraj 
et al., 2011a; De Angelo et al., 2013) and refuge areas (i.e., 
poor natural suitability but low levels of human disturbance). 
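
The two-axis categorisation can be sketched as a simple lookup on two suitability scores. The definitions of "attractive sink" and "refuge" follow the text above; the threshold of 0.5 and the labels "source-like" and "sink or matrix" are assumptions made for this illustration, as are the cell values.

```python
def classify_cell(natural_suitability, human_suitability, threshold=0.5):
    """Categorise one grid cell from two habitat models.

    natural_suitability: prediction of the reproduction-related model
        (food resources and cover), scaled 0..1.
    human_suitability: prediction of the survival-related model, scaled 0..1,
        where high values mean LOW human disturbance.
    """
    good_natural = natural_suitability >= threshold
    low_disturbance = human_suitability >= threshold
    if good_natural and low_disturbance:
        return "source-like"
    if good_natural:
        return "attractive sink"   # good natural habitat, high disturbance
    if low_disturbance:
        return "refuge"            # poor natural habitat, low disturbance
    return "sink or matrix"

# Hypothetical cells: (natural suitability, human-related suitability).
cells = [(0.9, 0.8), (0.8, 0.2), (0.3, 0.9), (0.1, 0.1)]
categories = [classify_cell(n, h) for n, h in cells]
print(categories)
```

Mapping these categories back onto the landscape highlights, for instance, attractive sinks where reducing human-caused mortality may be more effective than habitat improvement.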

When the inclusion of land cover and other human disturbance 
is not possible (e.g., because of temporal differences in the 
occurrence data and landscape features), predictions can be 
overlaid with current habitat quality and disturbance status 
(e.g., current land use, location of protected areas, and human 
population density) to estimate the conservation status of the 
species (e.g., Papes & Gaubert, 2007) or to assess habitat 
loss (Lopez-Arevalo et al., 2011). For the flat-headed cat 
in Borneo, Wilting et al. (2010) showed the discrepancy 
between potentially suitable habitat and habitat loss due 
to palm oil plantations and concluded that connectivity 
may only be possible via river beds; thus a conservation goal 
resulting from this modelling exercise would be restoring 
riparian vegetation. 

Second, in a dynamic manner habitat suitability maps can be 
used as the spatial basis for simulation modelling assessing 
habitat connectivity and population viability, e.g., for 
reintroduced or expanding carnivore populations in human- 
dominated landscapes (Wiegand et al., 2004a,b; Kramer- 
Schadt et al., 2005; Imron et al., 2010; Marucco & McIntire, 
2010). Species distribution maps can also be included in 
land use development simulation scenarios that include the 
economic aspect. To link biodiversity to monetary values is 
a crucial aspect in tropical biodiversity conservation. In this 
context, Koh & Ghazoul (2010) developed spatial palm oil 
expansion models focussing either on agricultural expansion, 
forest protection, or carbon conservation in Indonesia. For 
this, they included species biodiversity maps to assess the 
degree of biodiversity loss due to each palm oil expansion 
scenario. We conclude that SDMs are an important and 
multidimensional prerequisite in carnivore conservation. 

SUPPLEMENTARY MATERIAL - CLASSIFICATION OF REMOTE SENSING DATA 

In classification, the satellite image is processed to assign each pixel to a category, either by classifying each pixel separately or by 
segmenting the image into regions. The result of this classification is a vegetation map, land use map, or other map grouping related 
features. In unsupervised classification, the classification is performed using an algorithm, e.g., the ISODATA (Iterative Self-Organising 
Data Analysis Technique) clustering method, on satellite images that groups similar pixels into spectral classes. The number of spectral 
classes, which is decided by the researcher, is usually more than the desired number of categories (e.g., land cover types). In the next 
step, these spectral classes are assigned to one of the land cover types based on the ground-truthing points collected from the field (e.g., 
Wikramanayake et al., 2004; Kanagaraj et al., 2011a). 

Ground-truthing can be conducted in the sites selected by a stratified random sampling procedure recording the land use and structural 
vegetation parameters. These points can be divided into two parts: training sites and test sites. The land cover type for an unsupervised spectral 
class may be decided by visual interpretation of training ground-truthing points or the results of unsupervised classification can be used in 
the supervised classification method to determine land cover types (Richards, 1993). In the supervised classification method, the computer 
uses an algorithm (e.g., sequential maximum a posteriori, parallelepiped, minimum distance, maximum likelihood, Mahalanobis distance, 
etc.) to automatically classify the satellite image into the desired number of land cover classes using the training data set (e.g., Izquierdo et 
al., 2008; Yüksel et al., 2008; De Angelo, 2011). In addition to (vegetation) ground-truthing points, ancillary data (e.g., aerial 
photos or previous land cover maps and vector overlays such as roads, rivers and populated places), if available, can be incorporated, in a 
way that depends on the selected classification method, to improve the classification (Gao et al., 2006). For example, using an expert classification 
model which incorporates ancillary geo-referenced data (land use data, spatial texture and digital elevation model), the initial supervised 
classification is reclassified to provide high classification accuracy (Stefanov & Netzband, 2005; Yüksel et al., 2008; Kahya et al., 2010). 

The band combination for image classification can best be selected using a quantitative method such as optimum index factor (OIF; 
Chavez, 1984). Using first and second ranks of the OIF calculation with the inclusion of the thermal band (e.g., band 6 in Landsat imagery) 
along with reflective (spectral) bands in the data set may provide the best band combination for classifications (Ehsani & Quiel, 2010). 
Other methods include first applying principal component analysis to the bands in the imagery and then implementing a supervised 
maximum likelihood classification approach on the principal components (Gomez et al., 2005). The wavelet fusion concept 
has proved to be a highly efficient method for dealing with data of different spatial resolutions (e.g., ASTER images with three different 
spatial resolutions), converting the bands to the same spatial resolution before performing the land cover classification (Ranchin & Wald, 
2000; Bagan et al., 2008). These classifications are usually accomplished with remote sensing software (e.g., ERDAS Imagine, RSI 
ENVI, PCI Geomatics, MultiSpec, ArcGIS extension: Image Analysis, open source GRASS GIS). Finally, the accuracy (overall accuracy, 
omission error (the complement of producer’s accuracy), commission error (the complement of user’s accuracy), and kappa index, e.g., using the Kappa Tool extension in ArcView 
3.x with confusion matrix analysis) of these classified land cover types can be estimated using the test points collected from the field or 
a reference topographic map (e.g., Jensen, 1996; Yüksel et al., 2008; Ehsani & Quiel, 2010; Kahya et al., 2010). 
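
The optimum index factor mentioned above is commonly computed, for each combination of three bands, as the sum of the band standard deviations divided by the sum of the absolute pairwise correlation coefficients. The sketch below ranks band triplets in this way; the band names and pixel values are hypothetical, and the sample standard deviation is an assumption of this illustration.

```python
import itertools
import math
import statistics

def pearson(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def oif(bands, combo):
    """Sum of standard deviations / sum of |pairwise correlations| for three bands."""
    sd_sum = sum(statistics.stdev(bands[name]) for name in combo)
    r_sum = sum(abs(pearson(bands[x], bands[y]))
                for x, y in itertools.combinations(combo, 2))
    return sd_sum / r_sum

def rank_band_triplets(bands):
    """All three-band combinations, highest OIF first."""
    triplets = itertools.combinations(sorted(bands), 3)
    return sorted(triplets, key=lambda combo: oif(bands, combo), reverse=True)

# Hypothetical pixel samples for four bands.
bands = {
    "b3": [30, 42, 38, 55, 61, 47],
    "b4": [80, 60, 95, 70, 110, 65],
    "b5": [52, 66, 60, 81, 90, 70],
    "b6": [120, 118, 125, 119, 130, 121],
}
ranking = rank_band_triplets(bands)
print(ranking[0], round(oif(bands, ranking[0]), 2))
```

A high OIF favours bands that vary strongly but are weakly correlated with each other, i.e., combinations carrying the least redundant information.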